Bayesian nonparametric inference for heterogeneously mixing infectious disease models

Significance Mathematical models of infectious disease transmission continue to play a vital role in understanding, mitigating, and preventing outbreaks. The vast majority of epidemic models in the literature are parametric, meaning that they contain inherent assumptions about how transmission occurs in a population. However, such assumptions can be lacking in appropriate biological or epidemiological justification and in consequence lead to erroneous scientific conclusions and misleading predictions. We propose a flexible Bayesian nonparametric framework that avoids the need to make strict model assumptions about the infection process and enables a far more data-driven modeling approach for inferring the mechanisms governing transmission. We use our methods to enhance our understanding of the transmission mechanisms of the 2001 UK foot and mouth disease outbreak.

The computational advantage is that the prior ratio is equivalent to the inverse of the proposal ratio and hence the probability 30 of acceptance only depends on the augmented likelihoods ratio.

31
Updating the correlation parameters ρ = {ρ j,k }: We update each of the p(p − 1)/2 correlation parameters, one at a 32 time, using a random-walk Metropolis algorithm. That is, for a given pair of types j and k, we propose a new correlation value, 33 ρ j,k , by ρ j,k ∼ N ρ j,k , σ 2 ρ , where σ 2 ρ is a tuning parameter that can vary with j and k if necessary.

34
Denote by Σ the matrix Σ constructed using ρ j,k for the particular pair j and k that is being updated. If ρ j,k ∈ (−1, 1) and Σ is positive definite, then the proposed value is accepted with probability given by: otherwise, ρ j,k is rejected.

35
Updating the length scale parameter l: We update the (common across the p types) length scale parameter l using a 36 random-walk Metropolis algorithm.

37
Step 3 of Algorithm 1 for the MOC Model 1: Propose l ∼ N (l, σ 2 ); 2: Accept l with probability: where Σ denotes the covariance matrix in (1) where each of the covariance matrices Σ (i,j) , for i, j = 1, . . . , p, is constructed using l and σ 2 is a tuning parameter.

B. Updating Parameters Specific to the IGP Model.
For the IGP model we set ρ j,k = 0 for all j and k and allow the covariance 38 functions to have different length scale parameters, l1, . . . , lp. We assign independent Exponential prior distributions to lj, 39 j = 1, . . . , p with mean 1/χ l j . Hence, the target posterior density up to proportionality is given by: Updating the infection rate functions β = (β (1) , . . . , β (p) ): We update the infection rate functions for each type, one 44 at a time, using the same proposal mechanism as in the MOC model.
Step 2 of Algorithm 1 for the IGP Model Set β (j) = exp f (j) ;

5:
Accept β (j) with probability given by where δ ∈ (0, 1] is a tuning parameter that can vary with j if necessary.
Updating the length scale parameters l1, . . . , lp: We update the length scale parameters, one at a time, using a 46 random-walk Metropolis algorithm.
Step 3 of Algorithm 1 for the IGP Model Accept l j with probability given by: where Σ (j) denotes the covariance matrix for type j evaluated with length scale parameter l j , and σ 2 is a tuning parameter.

C. Updating Parameters Specific to the DB Model.
In the this model we first set f (1) as a baseline, to which we assign a GP prior with mean zero and covariance matrix Σ (1) . For j = 2, . . . p we then assume that where u (j) represents the discrepancy between f (j) with f (1) , u (2) , . . . , u (p) assumed to be mutually independent. We note that 48 the choice of the baseline is arbitrary due the labeling of the types and one may choose a particular type to be the basiline that 49 aids interpretation of the results.We further assume that covariance matrices of the particular discrepancies have individual 50 length scale parameters, l1, . . . , lp. The target posterior density up to proportionality is given by: Updating the infection rate functions β = (β (1) , . . . , β (p) ): We update the infection rate functions as a block as follows:
Updating the length scale parameters l1, . . . , lp: We update the length scale parameters, one at a time, using a random-walk Metropolis algorithm.

57
Step 3 of Algorithm 1 for the DB Model Propose l j ∼ N (lj, σ 2 );

3:
Accept l j with probability given by: where Σ (j) denotes the covariance matrix for type j evaluated with length scale parameter l j , and σ 2 is a tuning parameter.

D. Updating the infectious period distribution rate parameter.
Step 4 in Algorithm 1 is the same irrespective of the nonparametric model used for the infection rate function (MOC, IGP, DB). Having placed an Exponential prior distribution on γ, the full conditional distribution of γ is a Gamma distribution with shape parameter 1 + nλ and rate n j=1 (rj − ij): which can be sampled directly.

58
E. Updating the Infection Times. We treat the unobserved infection times as unknown parameters within a data-augmentation 59 framework (2). We start by proposing to update the label of the initial infective and the corresponding initial infection times 60 as follows:

Updating the remaining infection times
We write i + ij to denote the set of infection times with ij included and i − ij 62 to be the set with ij removed, for some j ∈ {1, . . . , n}.

69
We use the Mean Projection Approximation (MPA) to reduce the cost of repeatedly evaluating the inverse of covariance 70 matrices which are needed when evaluating the GP prior densities. The MPA, essentially, places a GP prior distribution on a 71 function over a pseudo data set and then projects it onto the full data set.

72
Denote the function of interest over the data set d by g and the function over the pseudo data setd byḡ. Recall the idea of the Projected Process Approximation (PPA) (see, e.g., 4), in which a joint GP prior distribution is placed on g andḡ such that Here, Σ d,d is a B × B matrix whose elements are the covariances between the values of the function g at two input values Adapting the idea of PPA to our setting, the MPA projects the functionḡ over the pseudo datasetd onto the full data set d 78 by using the mean of the conditional distribution of g|ḡ: In other words, the values of g over the full data set are deterministic functions of the values ofḡ over the pseudo data set. In 81 particular, we can obtain g using the inverse of Σd ,d rather than the inverse of Σ d,d , which would have been computationally 82 prohibitive.

83
Although some care is required to construct the pseudo setd to ensure that the points are sufficient in number and suitably 84 placed across the entire domain to capture the features of the function f , simulation studies in (5) suggest that the error 85 introduced by MPA is small, even when the size ofd is relatively small, e.g. 10% of d.

86
A. Implementing MPA for the MOC model. We demonstrate how to sample from the target posterior density of the MOC model 87 when the MPA is employed. We place the joint prior distribution of (2) on the functions f (j) andf (j) for each type j = 1, . . . , p. 88 We further assume that the joint distribution of f (j) andf (j) is independent of the joint distribution of f (m) andf (m) for any 89 m = j, m = 1, . . . , p. As mentioned above, the MPA exploits the conditional distribution of f (j) |f (j) and in particular each 90f (j) is projected to f (j) using its mean and this is done for the function of each type, independently of each other. Step 2 of Algorithm 1 for the MOC Model using MPA 1: Drawν (1) , . . . ,ν (p) from the joint GP prior distribution on the input spaced: (1,p) ρ2,1Σ (2,1) · · · ρ2,pΣ (2,p) . . . . . .
Updating the correlation parameters ρ = {ρ j,k }: We update each of the p(p − 1)/2 correlation parameters, one at a Denote byΣ the matrix Σ in (1) constructed on the input spaced and using ρ j,k for the particular pair j and k that is being updated. If ρ j,k ∈ (−1, 1) andΣ is positive definite, then the proposed value is accepted with probability given by: otherwise, ρ j,k is rejected.

97
Updating the length scale parameter l: We update the (common across the p types) length scale parameter l using a 98 random-walk Metropolis algorithm.

99
Step 3 of Algorithm 1 for the MOC Model 1: Propose l ∼ N (l, σ 2 ); 2: Accept l with probability: whereΣ denotes the covariance matrix in (1) constructed on the input spaced and with each of the covariance matrices Σ (i,j) , for i, j = 1, . . . , p, constructed using l and σ 2 is a tuning parameter.

100
We present here a numerical comparison for the simulation study with synthetic data for two types. We define the maximum absolute error for farm type m by is the posterior median of the model in question. These values, given in Table S1, show that the MOC model 101 performs best for the type 0 infection rate, and both the MOC and IGP models perform well for the type 1 infection rate.
102 Table S1. Maximum absolute error for the estimates for type zero and type one infection rates.

103
A. Simulation Study I. We chose 1,000 farms uniformly at random from the 2001 UK Foot and Mouth disease data set described in the main text, and simulated an outbreak assuming that all farms were of the same type and with infection ratẽ where dij denotes the Euclidean distance between farms i and j. We further assumed a Gamma distributed infectious period 104 with mean 6 days and standard deviation 3.46 days. We fitted six models to the data set consisting of the simulated infection 105 and removal times, with the only difference between the models being the assumption about the functional form of the infection 106 kernel (see Table 2 in the main text). As there is only one type of farm in the data set, the nonparametric models described in 107 the main text (IGP, MOC and DB) are identical and denoted by M6 in Table 2 in the main text. We fitted M6 with fixed 108 hyperparameters α = 6 and l = 3km. In addition, we fitted five parametric models denoted by M1, . . . , M5, with their model 109 parameters being inferred from the data assuming vague, independent Exponential prior distributions with rate 0.01. We 110 further assumed a Gamma infectious period distribution with shape parameter fixed as λ = 3 and inferred the rate parameter 111 γ, also placing a vague, Exponential prior distribution on γ with rate 0.01.
112 Figure S1 shows the fitted parametric functions using the posterior mean parameter values, the posterior mean of our To investigate the effect of a ring-culling strategy as part of a control measure, we followed (6) and again simulated outbreaks 124 from the posterior predictive distribution of the infection and removal times. In contrast to the method described above, this 125 time all farms within 3 km were also removed/culled, regardless of their infection status. To mimic a realistic response, we 126 assumed the authorities only began culling when 30 farms are infected, and before 60 farms were infected it was assumed 127 that the available resources were only sufficient to cull at a 1.5 km radius. Table 2 in the main text shows the results which 128 reveal that only our nonparametric model, M6, closely matches both the predicted mean final size and probability of a severe 129 outbreak (i.e. one in which at least 10% of farms were infected) with the values predicted from the true model M1.

130
B. Simulation Study II. We carried out a second simulation study to assess the ability of our method to estimate the probability 131 of a severe outbreak and the final size, the former again defined as an epidemic where at least 10% of the initially susceptible 132 farms were infected. We chose 1,000 farms uniformly at random from the 2001 UK FMD disease data, and simulated an 133 outbreak assuming that all farms were of the same type and infection rateβij = βij(dij) = 0.7 exp{−0.7dij}, where dij denotes 134 the Euclidean distance between farms i and j. Infectious periods were assumed to be Gamma distributed with mean 6 days and 135 standard deviation 3.46 days. The simulated outbreak lasted 66 days and 665 farms were infected. We considered three models, 136 the only difference between them being the assumption about the functional form of the infection function (see Table S2). 137 We fitted two parametric models denoted by M1 and M2, with their parameters, (θ1, θ2) and λ1 respectively, being inferred 138 from the data assuming vague, independent Exponential prior distributions with rate 0.01. Since there is only one type of 139 farm in the dataset, the nonparametric models described in the main text (IGP, MOC and DB) are identical and denoted by 140 M3 in Table S2. We fitted M3 with fixed hyperparameters α = 6 and l = 4km. We also assumed a Gamma infectious period 141 distribution with shape parameter fixed as λ = 3 and inferred the rate parameter γ, also placing a vague, Exponential prior 142 distribution on γ with rate 0.01.